Random sampling from distributed streams

ABSTRACT

Described herein are methods, systems, apparatuses and products for random sampling from distributed streams. An aspect provides a method for distributed sampling on a network with a plurality of sites and a coordinator, including: receiving at the coordinator a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; comparing the weight of the data element received with a global value stored at the coordinator; and performing one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites. Other embodiments are disclosed.

FIELD OF THE INVENTION

The subject matter presented herein generally relates to optimal random sampling from distributed streams of data.

BACKGROUND

For many data analysis tasks, it is impractical to collect all the data at a single site and process it in a centralized manner. For example, data arrives at multiple network routers at extremely high rates, and queries are often posed on the union of data observed at all the routers. Since the data set is changing, the query results could also be changing continuously with time. This has motivated the continuous, distributed, streaming model. In this model there are k physically distributed sites receiving high-volume local streams of data. These sites talk to a central coordinator that has to continuously respond to queries over the union of all streams observed so far. A challenge is to minimize the communication between the different sites and the coordinator, while providing an accurate answer to queries at the coordinator at all times.

A problem in this setting is to obtain a random sample drawn from the union of all distributed streams. This generalizes the classic reservoir sampling problem to the setting of multiple distributed streams, and has applications to approximate query answering, selectivity estimation, and query planning. For example, in the case of network routers, maintaining a random sample from the union of the streams is valuable for network monitoring tasks involving the detection of global properties. Other problems on distributed stream processing, including the estimation of the number of distinct elements and heavy hitters, use random sampling as a primitive.

The study of sampling in distributed streams was initiated by prior work. Consider a set of k different streams observed by the k sites, with the total number of current items in the union of all streams equal to n. Prior work has shown how k sites can maintain a random sample of s items without replacement from the union of their streams using an expected O((k+s)log n) messages between the sites and the central coordinator. The memory requirement of the central coordinator is s machine words, and the time requirement is O((k+s)log n). The memory requirement of the remote sites is a single machine word, with constant time per stream update. Prior work has also proven a lower bound of Ω(k+s log(n/s)) on the expected number of messages sent in any scheme. Each message is assumed to be a single machine word, which can hold an integer of magnitude poly(kns).

BRIEF SUMMARY

One aspect provides a method for distributed sampling on a network with a plurality of sites and a coordinator, comprising: receiving at the coordinator a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; comparing the weight of the data element received with a global value stored at the coordinator; and performing one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.

Another aspect provides a computer program product for distributed sampling on a network with a plurality of sites and a coordinator, comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to receive, at the coordinator, a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; computer readable program code configured to compare the weight of the data element received with a global value stored at the coordinator; and computer readable program code configured to perform one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.

A further aspect provides a system comprising: at least one processor; and a memory device operatively connected to the at least one processor; wherein, responsive to execution of program instructions accessible to the at least one processor, the at least one processor is configured to: receive at the coordinator a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; compare the weight of the data element received with a global value stored at the coordinator; and perform one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.

The foregoing is a summary and thus may contain simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting.

For a better understanding of the embodiments, together with other and further features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying drawings. The scope of the invention will be pointed out in the appended claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIGS. 1-3 illustrate an example protocol for random sampling.

FIG. 4 illustrates an example coordinator and site network environment.

FIG. 5 illustrates an example method for random sampling from distributed streams.

FIG. 6 illustrates an example computing device.

DETAILED DESCRIPTION

It will be readily understood that the components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations in addition to the described example embodiments. Thus, the following more detailed description of the example embodiments, as represented in the figures, is not intended to limit the scope of the claims, but is merely representative of those embodiments.

Reference throughout this specification to “embodiment(s)” (or the like) means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases “according to embodiments” or “an embodiment” (or the like) in various places throughout this specification are not necessarily all referring to the same embodiment.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in different embodiments. In the following description, numerous specific details are provided to give a thorough understanding of example embodiments. One skilled in the relevant art will recognize, however, that aspects can be practiced without certain specific details, or with other methods, components, materials, et cetera. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obfuscation.

Regarding notation used herein, all logarithms are to the base 2 unless otherwise specified. Throughout the description, when asymptotic notation is used, the variable that is going to infinity is n, and s and k are functions of n.

A problem addressed by embodiments is that of distributed sampling. Namely, if there are k sites, each receiving a traffic stream, such as routers on a network, together with a central coordinator, the central coordinator's job is to maintain a statistic on the union of the k streams, such as the number of distinct elements, the heavy hitters or iceberg queries, et cetera. A challenge is that the stream is distributed across these k sites, and the sites should have as little communication as possible with the coordinator or with each other in order to minimize communication bandwidth, central coordinator computation time, et cetera.

A key primitive underlying statistics of interest is that of maintaining a random sample, or more generally s random samples, from the union of these streams. Given this, one may immediately obtain a protocol for finding the heavy hitters or estimating the number of distinct elements with related message (communication) complexity. Prior work established a protocol for doing this with O((k+s)log n) communication, where n is the length of the union of the streams. Such a protocol was not known to be optimal, and obtaining more practical protocols, which also asymptotically improve this complexity, is a problem addressed herein.

Accordingly, embodiments provide a new protocol that significantly improves the previous best protocol in the case that s is not too large, while k is large. Embodiments have the added benefit that the coordinator only communicates with a site when the site initiates the communication. Thus, it is possible for sites to go offline without affecting the correctness of the protocol, unlike in previous protocols. In many practical scenarios one first samples a small percentage of the stream, then runs whatever analysis is desired on the sampled stream. Embodiments give a more efficient way of doing this when the stream is distributed.

Embodiments have asymptotic message complexity

${O\left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {1 + \left( {k/s} \right)} \right)} \right)},$

which gives at least a log k factor savings in theory and often an even larger speed-up in practice. Further, embodiments are shown to be optimal in message complexity, as no protocol can do better by more than a fixed constant factor (independent of n, s and k).

Essentially, embodiments employ a protocol that hashes all stream tokens to a random number in [0,1] and maintains the minimum value. Clearly this is a random sample from the union of the streams. The value of the minimum is also roughly 1/(# of distinct elements), so one obtains a rough estimate this way. More refined estimates can be obtained by obtaining s samples.

Thus, an embodiment provides for sampling from distributed streams, as well as a matching lower bound showing that the message complexity of this approach is optimal. In an embodiment, sensors assign random numbers to their received data elements from the data streams. By comparing the random number assigned for a given data element to a current global threshold value, the sensors determine when it may be appropriate to report a sample back to a coordinator node. For example, when the random number value is lower than the current global threshold value, the sensor may report the sample back to the coordinator. The coordinator node in turn monitors received samples from the sensors, and notifies them that a reported sample either establishes a new global threshold value (for example, a new minimum to be used for comparison), or indicates that the sensor that reported the sample should update its current global threshold value.
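By way of illustration only, the following is a minimal Python sketch of this single-sample (s=1) reporting logic. The class and method names (Site, Coordinator, report) are illustrative assumptions rather than part of any described embodiment, and the messaging infrastructure is reduced to a direct method call.

```python
import random

class Coordinator:
    """Keeps the element with the smallest weight seen anywhere (s = 1)."""

    def __init__(self):
        self.u = 1.0        # current global minimum weight
        self.sample = None  # element achieving that minimum

    def report(self, element, weight):
        if weight < self.u:   # reported weight sets a new global minimum
            self.u = weight
            self.sample = element
        return self.u         # reply refreshes the site's local view

class Site:
    """One sensor; u_local is its (possibly stale) view of the minimum."""

    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.u_local = 1.0

    def observe(self, element):
        weight = random.random()   # random weight in [0, 1)
        if weight < self.u_local:  # possibly a new global minimum
            self.u_local = self.coordinator.report(element, weight)
```

Note that a site contacts the coordinator only when its locally drawn weight beats its own view of the threshold, matching the property that the coordinator never initiates communication.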

An embodiment provides a process for distributed sampling using an expected

$O\left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {1 + \left( {k/s} \right)} \right)} \right)$

number of messages for continuously maintaining a random sample of size s from k distributed data streams of total size n. Notice that if s<k/8, this number is

${O\left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {k/s} \right)} \right)},$

while if s≧k/8, this number is O(s log(n/s)).

The memory requirement in the protocol at the central coordinator is s machine words, and the time requirement is

${O\left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {1 + \left( {k/s} \right)} \right)} \right)}.$

The former is the same as in prior work, while the latter improves on a prior O((k+s)log n) time requirement. The remote sites store a single machine word and use constant time per stream update, which is clearly optimal. This leads to a significant improvement in the message complexity in the case when k is large. For example, for the basic problem of maintaining a single random sample from the union of distributed streams (s=1), an embodiment leads to a factor of O(log k) decrease in the number of messages sent in the system over prior work. Table 1 illustrates a summary of the results for message complexity of sampling without replacement.

TABLE 1

            Upper Bound                                Lower Bound
            Embodiment               Prior Work        Embodiment               Prior Work
s < k/8     O(k log(n/s)/log(k/s))   O(k log n)        Ω(k log(n/s)/log(k/s))   Ω(k + s log n)
s ≥ k/8     O(s log(n/s))            O(k log n)        Ω(s log(n/s))            Ω(s log(n/s))

The approach employed by embodiments is straightforward and reduces the communication necessary, as the coordinator communicates with a sensor (site) only if the site initiates the communication. This is useful in a setting where a site may go offline, since it does not require the ability of the site to receive broadcast messages.

Regarding a lower bound, it is described herein that for any constant q>0, any correct protocol must send

$\Omega \left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {1 + \left( {k/s} \right)} \right)} \right)$

messages with probability at least 1−q. This also yields a lower bound of

$\Omega \left( \frac{k\; {\log \left( {n/s} \right)}}{\log \left( {1 + \left( {k/s} \right)} \right)} \right)$

on the expected message complexity of any correct protocol, showing the expected number of messages sent by embodiments is optimal, up to constant factors.

In addition to being quantitatively stronger than the lower bound of prior work, the lower bound of embodiments is also qualitatively stronger, because the lower bound in prior work is on the expected number of messages transmitted in a correct protocol. However, such a bound does not rule out the possibility that, with large probability, far fewer messages are sent in the optimal protocol. In contrast, embodiments lower bound the number of messages that must be transmitted in any protocol 99% of the time. Since the time complexity of the central coordinator is at least the number of messages received, the time complexity of embodiments is also optimal.

The protocol used may also be modified in other embodiments to obtain a random sample of s items from k distributed streams with replacement. Here, a protocol with

$O\left( {\left( {\frac{k}{\log \left( {2 + \left( {k/\left( {s\; \log \; s} \right)} \right)} \right)} + {\log (s)}} \right)\log \; n} \right)$

messages is provided, improving the O((k+s log s)log n) message protocol of prior work. The same improvement in the time complexity of the coordinator is achieved.

As a corollary, a protocol for estimating the heavy hitters in distributed streams with the best known message complexity is provided. In this problem, a set H of items is sought such that if an element e occurs at least an ε fraction of times in the union of the streams, then e ∈ H, and if e occurs less than an ε/2 fraction of times in the union of the streams, then e ∉ H. It is known that O(ε⁻² log n) random samples suffice to estimate the set of heavy hitters with high probability, and the previous best algorithm was obtained by plugging s=O(ε⁻² log n) into a protocol for distributed sampling. Embodiments thus improve the message complexity from O((k+ε⁻² log n)log n) to

$O\left( \frac{k \log(\varepsilon n)}{\log(\varepsilon k)} + \varepsilon^{-2} \log(\varepsilon n) \log n \right).$

This can be significant when k is large compared to 1/ε.
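As an illustration of how such a sample could be consumed, the following Python sketch estimates the heavy hitter set H from a uniform sample. The 3ε/4 reporting threshold is one standard choice and is an assumption of this sketch, not a parameter mandated by the description.

```python
from collections import Counter

def heavy_hitters_from_sample(sample, epsilon):
    """Report items whose frequency in a uniform random sample is at
    least (3/4)*epsilon. With a sample of size O(epsilon**-2 * log n),
    sampled frequencies concentrate within epsilon/4 of the truth with
    high probability, so items with true frequency >= epsilon are kept
    and items below epsilon/2 are rejected."""
    counts = Counter(sample)
    threshold = 0.75 * epsilon * len(sample)
    return {e for e, c in counts.items() if c >= threshold}
```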

It should be noted that the model used in this description is that of a proactive coordinator, as opposed to prior proposed reactive coordinators, which do not continuously maintain an estimate of the required aggregate, but only obtain an estimate when a query is posed to the coordinator. Moreover, this description concerns the case of a non-sliding window, and does not concern the case of sliding windows.

The description now turns to the figures. The illustrated example embodiments will be best understood by reference to the figures. The following description is intended only by way of example and simply illustrates certain example embodiments representative of the invention, as claimed.

Model

Referring to FIG. 4 generally, consider a system with k different sites (sensors 410, 420), numbered from 1 to k, each receiving a local stream of elements. Let S_i denote the stream observed at site i. There is one coordinator node 430, which is different from any of the sites. The coordinator does not observe a local stream, but all queries for a random sample arrive at the coordinator. Let S=∪_{i=1}^{k} S_i be the entire stream observed by the system, and let n=|S|. The sample size s is a parameter supplied to the coordinator 430 and to the sites 410, 420 during initialization.

The task of the coordinator 430 is to continuously maintain a random sample of size min{n, s} consisting of elements chosen uniformly at random without replacement from S. The cost of the protocol is the number of messages transmitted.

A synchronous communication model is assumed, where the system progresses in “rounds”: in each round, each site 410, 420 can observe one element (or none), send a message to the coordinator 430, and receive a response from the coordinator. The coordinator 430 may receive up to k messages in a round, and respond to each of them in the same round. Thus, in some respects, this model is similar to the model in previous work. The case where multiple elements are observed per round is further described herein.

The sizes of the different local streams at the sites 410, 420, their order of arrival, and the interleaving of the streams at different sites can all be arbitrary. The process utilized by embodiments makes no assumptions regarding these.

Process

The basic approach behind the process is essentially as follows. Each site 410, 420 associates a random “weight” with each element that it receives. The coordinator 430 then maintains the set of s elements with the minimum weights in the union of the streams at all times, and this is a random sample of S. The approach is similar in spirit to centralized reservoir sampling. In a distributed setting, an interesting aspect is at what times the sites 410, 420 communicate with the coordinator 430, and vice versa.

In an embodiment, the coordinator 430 maintains u, which is the s-th smallest weight so far in the system, as well as the sample, consisting of all the elements that have weight no more than u. Each site 410, 420 need only maintain a single value u_i, which is the site's view of the s-th smallest weight in the system so far. Note that it is too expensive to keep the view of each site synchronized with the coordinator's view at all times: the value of the s-th smallest weight changes O(s log(n/s)) times, and updating every site each time the s-th minimum changes takes a total of O(sk log(n/s)) messages.

In an embodiment, when site i sees an element with a weight smaller than u_i, it sends the element to the central coordinator 430. The coordinator 430 updates u and the sample if needed, and then replies back to site i with the current value of u, which is the true s-th smallest weight in the union of all streams. Thus, each time a site communicates with the coordinator 430, it either makes a change to the random sample or, at least, gets to refresh its view of u.
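For s > 1 the coordinator must track the s smallest weights; the sketch below does so with a max-heap, again under hypothetical names. It returns u (the s-th smallest weight so far, or 1 until s elements have been seen), mirroring the reply described above.

```python
import heapq

class CoordinatorS:
    """Maintains the s (element, weight) pairs of minimum weight."""

    def __init__(self, s):
        self.s = s
        self.heap = []  # max-heap on weight, realized by negating weights

    def report(self, element, weight):
        if len(self.heap) < self.s:
            heapq.heappush(self.heap, (-weight, element))
        elif weight < -self.heap[0][0]:
            # new weight displaces the largest of the s smallest weights
            heapq.heapreplace(self.heap, (-weight, element))
        # u is the s-th smallest weight once s elements have been seen
        return -self.heap[0][0] if len(self.heap) == self.s else 1.0

    def sample(self):
        return [element for _, element in self.heap]
```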

FIGS. 1-2 generally illustrate the process at each site. The process at the coordinator is generally illustrated at FIG. 3. The following two lemmas establish the correctness of the process.

Lemma 1. Let n be the number of elements in S so far. (1) If n≤s, then the sample at the coordinator contains all the (e,w) pairs seen at all the sites so far. (2) If n>s, then the sample at the coordinator consists of the s (e,w) pairs whose weights are the smallest weights in the stream so far. The proof of this lemma is immediate from the description of the protocol.

Lemma 2. At the end of each round, the sample at the coordinator consists of a uniform random sample of size min{n, s} chosen without replacement from S.

Proof. In case n≤s, it is known that the sample contains every element of S. In case n>s, from Lemma 1, it follows that the sample consists of the s elements with the smallest weights from S. Since the weights are assigned randomly, each element in S has a probability of s/n of belonging to the sample, showing that this is a uniform random sample. Since an element can appear no more than once in the sample, this is a sample chosen without replacement.

Analysis

Now described are the message complexity and the maintenance of a random sample. For the sake of description, execution of the process is divided into “epochs”, where each epoch consists of a sequence of rounds. The epochs are defined inductively. Let r>1 be a parameter, which will be fixed later. Recall that u is the s-th smallest weight so far in the system (if there are fewer than s elements so far, u=1). Epoch 0 is the set of all rounds from the beginning of execution until (and including) the earliest round where u is 1/r or smaller. Let m_i denote the value of u at the end of epoch i−1. Then epoch i consists of all rounds subsequent to epoch i−1 until (and including) the earliest round when u is m_i/r or smaller. Note that the process does not need to be aware of the epochs, and the description of epochs is only used for ease of description.
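As an informal preview (an illustrative calculation only, not part of the formal analysis that follows), the role of r can be seen as follows: u shrinks by a factor of at least r per completed epoch, and u is roughly s/n once n elements have arrived, so the number of epochs should behave like

$m_{i+1} \leq \frac{m_{i}}{r} \;\Rightarrow\; m_{i} \leq r^{-i}, \qquad u \approx \frac{s}{n} \;\Rightarrow\; \xi \approx \log_{r}\left( \frac{n}{s} \right) = \frac{\log(n/s)}{\log r}.$

Lemma 4 below makes this heuristic precise.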

If the original distributed system process illustrated in FIGS. 1-3 is referred to as Process A, then for the analysis a slightly different distributed process, Process B, is described. Process B is identical to Process A except for the fact that at the beginning of each epoch, the value u is broadcast by the coordinator to all sites.

While Process A is natural, Process B is easier to analyze. First, it is noted that, on the same inputs, the value of u (and the sample) at the coordinator at any round in Process B is identical to the value of u (and the sample) at the coordinator in Process A at the same round. Hence, the partitioning of rounds into epochs is the same for both processes, for a given input. The correctness of Process B follows from the correctness of Process A. The only difference between them is in the total number of messages sent. In Process B, there is a property that for all i from 1 to k, u_i=u at the beginning of each epoch (though this is not necessarily true throughout the epoch), and for this, Process B has to pay a cost of at least k messages in each epoch.

Lemma 3. The number of messages sent by Process A for a set of streams S_j, j=1 . . . k, is never more than twice the number of messages sent by Process B for the same input.

Proof. Consider site v in a particular epoch i. In Process B, v receives m_i at the beginning of the epoch through a message from the coordinator. In Process A, v may not know m_i at the beginning of epoch i. Two cases are considered.

Case I: v sends a message to the coordinator in epoch i in Process A. In this case, the first time v sends a message to the coordinator in this epoch, v will receive the current value of u, which is smaller than or equal to m_i. This communication costs two messages, one in each direction. Henceforth, in this epoch, the number of messages sent in Process A is no more than those sent in Process B. In this epoch, the number of messages transmitted to/from v in Process A is at most twice the number of messages as in Process B, which has at least one transmission from the coordinator to site v.

Case II: v did not send a message to the coordinator in this epoch, in Process A. In this case, the number of messages sent in this epoch to/from site v in Process A is smaller than in Process B.

Let ξ denote the total number of epochs.

Lemma 4. If r≥2, then

$E[\xi] \leq \frac{\log(n/s)}{\log r} + 2.$

Proof. Let

$z = \frac{\log(n/s)}{\log r}.$

First, it is noted that in each epoch, u decreases by a factor of at least r. Thus, after (z+l) epochs, u is no more than

$\frac{1}{r^{z+l}} = \left( \frac{s}{n} \right) \frac{1}{r^{l}}.$

Thus,

$\Pr[\xi \geq z+l] \leq \Pr\left[ u \leq \left( \frac{s}{n} \right) \frac{1}{r^{l}} \right].$

Let Y denote the number of elements (out of n) that have been assigned a weight of

$\frac{s}{n\, r^{l}}$

or less. Y is a binomial random variable with expectation

$\frac{s}{r^{l}}.$

Note that if

$u \leq \frac{s}{n\, r^{l}},$

it must be true that Y≥s. Thus,

${\Pr \left\lbrack {\xi \geq {z + }} \right\rbrack} \leq {\Pr \left\lbrack {Y \geq s} \right\rbrack} \leq {\Pr \left\lbrack {Y \geq {r^{}{E\lbrack Y\rbrack}}} \right\rbrack} \leq \frac{1}{r^{}}$

where Markov's inequality has been used.

Since ξ takes only positive integral values,

${E\lbrack\xi\rbrack} = {{\sum\limits_{i > 0}^{\;}{\Pr \left\lbrack {\xi \geq i} \right\rbrack}} = {{{\sum\limits_{i = 1}^{z}\; {\Pr \left\lbrack {\xi \geq i} \right\rbrack}} + {\sum\limits_{ \geq 1}^{\;}{\Pr \left\lbrack {\xi \geq {z + }} \right\rbrack}}} \leq {z + {\sum\limits_{ \geq 1}^{\;}\frac{1}{r^{}}}} \leq {z + \frac{1}{1 - {1/r}}} \leq {z + 2}}}$

where r≧2.

Let n_j denote the total number of elements that arrived in epoch j; thus n=Σ_{j=0}^{ξ−1} n_j. Let μ denote the total number of messages sent during the entire execution, and let μ_i denote the total number of messages sent in epoch i. Let X_i denote the number of messages sent from the sites to the coordinator in epoch i. Then μ_i is the sum of two parts: (1) the k messages sent by the coordinator at the start of the epoch, and (2) two times the number of messages sent from the sites to the coordinator, since each such message receives a reply.

$\begin{matrix} {\mu_{i} = k + 2X_{i}} & (1) \\ {\mu = \sum_{i=0}^{\xi-1} \mu_{i} = \xi k + 2 \sum_{i=0}^{\xi-1} X_{i}} & (2) \end{matrix}$

Consider epoch i. For each element j=1 . . . n_i in epoch i, a 0-1 random variable Y_j is defined as follows: Y_j=1 if observing the j-th element in the epoch resulted in a message being sent to the coordinator, and Y_j=0 otherwise.

$\begin{matrix}{X_{i} = {\sum\limits_{j = 1}^{n_{i}}\; Y_{j}}} & (3)\end{matrix}$

Let F(η,α) denote the event that n_i=η and m_i=α. The following lemma gives a bound on a conditional probability that is used later.

Lemma 5. For each j=1 . . . n_i−1,

$\Pr\left[ Y_{j} = 1 \mid F(\eta,\alpha) \right] \leq \frac{\alpha - \alpha/r}{1 - \alpha/r}.$

Proof. Suppose that the j-th element in the epoch was observed by site v. For this element to cause a message to be sent to the coordinator, the random weight assigned to it must be less than u_v at that instant. Conditioned on m_i=α, u_v is no more than α. Note that in this lemma, the last element that arrived in epoch i is excluded; thus, the weight assigned to element j must be greater than α/r, and hence the weight assigned to j is a uniform random number in the range (α/r, 1). The probability that this weight is less than the current value of u_v is no more than

$\frac{\alpha - \alpha/r}{1 - \alpha/r},$

since u_v≤α.

Lemma 6. For each epoch i,

$E[X_{i}] \leq 1 + 2rs.$

Proof. The expectation conditioned on F(η,α) is first obtained, and then the conditioning is removed. From Lemma 5 and Equation 3:

${E\left\lbrack {F_{i}{F\left( {\eta,\alpha} \right)}} \right\rbrack} \leq {1 + {E\left\lbrack {\left( {\sum\limits_{j = 1}^{\eta - 1}\; Y_{j}} \right){F\left( {\eta,\alpha} \right)}} \right\rbrack}} \leq {1 + {\sum\limits_{j = 1}^{\eta - 1}{E\left\lbrack {Y_{j}{F\left( {\eta,\alpha} \right)}} \right\rbrack}}} \leq {1\left( {\eta - 1} \right)\frac{\alpha - {\alpha/r}}{1 - {\alpha/r}}}$

Using r≧2 and α≦1 gives E[X_(i)|F(η,α)]≦1+2(η−1)α.

Next considered is the conditional expectation E[X_(i)|m_(i)=α].

${E\left\lbrack {{X_{i}m_{i}} = \alpha} \right\rbrack} = {{\sum\limits_{\eta}^{\;}{{\Pr \left\lbrack {n_{i} = {{\eta m_{i}} = \alpha}} \right\rbrack}{E\left\lbrack {{{X_{i}n_{i}} = \eta},{m_{i} = \alpha}} \right\rbrack}}} \leq {\sum\limits_{\eta}^{\;}{{\Pr \left\lbrack {n_{i} = {{\eta {}m_{i}} = \alpha}} \right\rbrack}\left( {1 + {2\left( {\eta - 1} \right)\alpha}} \right)}} \leq {E\left\lbrack {{{1 + {2\left( {n_{i} - 1} \right)\alpha}}m_{i}} = \alpha} \right\rbrack} \leq {1 + {2{\alpha \left( {{E\left\lbrack {{n_{i}m_{i}} = \alpha} \right\rbrack} - 1} \right)}}}}$

Using Lemma 7 gives

$E[X_{i} \mid m_{i}=\alpha] \leq 1 + 2\alpha\left( \frac{rs}{\alpha} - 1 \right) \leq 1 + 2rs.$

Since E[X_i]=E[E[X_i | m_i=α]], it follows that E[X_i]≤E[1+2rs]=1+2rs.

Lemma 7.

$E[n_{i} \mid m_{i}=\alpha] \leq \frac{rs}{\alpha}.$

Proof. Recall that n_i, the total number of elements in epoch i, is the number of elements observed until the s-th minimum in the stream decreases to a value that is less than or equal to α/r. Let Z denote a random variable that equals the number of elements to be observed from the start of epoch i until s new elements are seen, each of whose weight is at most α/r. Clearly, conditioned on m_i=α, it must be true that n_i≤Z. For j=1 to s, let Z_j denote the number of elements observed from the state when (j−1) elements have been observed with weights at most α/r until the state when j elements have been observed with weights at most α/r. Z_j is a geometric random variable with parameter α/r.

Since Z=Σ_{j=1}^{s} Z_j and

$E[Z] = \sum_{j=1}^{s} E[Z_{j}] = \frac{rs}{\alpha},$

and since E[n_i | m_i=α]≤E[Z], the lemma follows.

Lemma 8.

${E\lbrack\mu\rbrack} \leq {\left( {k + {4{rs}} + 2} \right)\left( {\frac{\; {\log \left( {n/s} \right)}}{\; {\log \; r}} + 2} \right)}$

Proof. Using Lemma 6 and Equation 1, the expected number of messages in epoch i is:

$E[\mu_{i}] \leq k + 2(2rs + 1) = k + 2 + 4rs.$

Note that the above bound is independent of i. The lemma then follows from Lemma 4, which gives an upper bound on the expected number of epochs.

Theorem 1. The expected message complexity E[μ] of the process is as follows.

I: If s ≥ k/8, then E[μ] = O(s log(n/s)).

II: If s < k/8, then

$E[\mu] = O\left( \frac{k \log(n/s)}{\log(k/s)} \right).$

Proof. It is noted that the upper bound on E[μ] in Lemma 8 holds for any value of r≥2.

Case I: s ≥ k/8. In this case, r is set equal to 2. From Lemma 8, using k ≤ 8s,

$E[\mu] \leq (8s + 8s + 2)\left( \frac{\log(n/s)}{\log 2} + 2 \right) = (16s + 2)\left( \log\left( \frac{n}{s} \right) + 2 \right) = O\left( s \log\left( \frac{n}{s} \right) \right).$

Case II: s < k/8. Setting

$r = \frac{k}{4s}$

(which is at least 2, since s < k/8) in Lemma 8 yields

$E[\mu] = O\left( \frac{k \log\left( \frac{n}{s} \right)}{\log \frac{k}{s}} \right).$

Lower Bound

Theorem 2. For any constant q, 0<q<1, any correct protocol must send

$\Omega \left( \frac{k\; {\log \left( {n/s} \right)}}{\; {\log \left( {1 + \left( {k/s} \right)} \right)}} \right)$

messages with probability at least 1−q, where the probability is taken over the protocol's internal randomness.

Proof. Let β=1+(k/s). Define

$e = \Theta\left( \frac{\log(n/s)}{\log(1 + (k/s))} \right)$

epochs as follows: in the i-th epoch, i ∈ {0, 1, 2, . . . , e−1}, there are β^{i−1}k global stream updates, which can be distributed among the k servers in an arbitrary way.

A distribution on orderings of the stream updates is considered. Namely, on a totally-ordered stream 1, 2, 3, . . . , n of n updates, the β^{i−1}k updates in the i-th epoch are randomly assigned among the k servers, independently for each epoch. Let the randomness used for the assignment in the i-th epoch be denoted σ_i.

Considering the global stream of updates 1, 2, 3, . . . , n, suppose a sample set ρ of size s is maintained without replacement. Letting ρ_i denote a random variable indicating the value of ρ after seeing i updates in the stream, the following lemma about random sampling is used.

Lemma 9. For any constant q>0, there is a constant C′=C′(q)>0 for which:

-   ρ changes at least C′s log(n/s) times with probability at least 1−q, and

-   if s<k/8 and k=ω(1) and e=ω(1), then with probability at least 1−q/2 over the choice of {σ_i}, there are at least (1−(q/8))e epochs for which the number of times ρ changes in the epoch is at least C′s log(1+(k/s)).

Proof. Consider the stream 1, 2, 3, . . . , n of updates. In the classical reservoir sampling algorithm, ρ is initialized to {1, 2, 3, . . . , s}. Then, for each i>s, the i-th element is included in the current sample set ρ with probability s/i, in which case a random item in ρ is replaced with i.
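For reference, a standard Python sketch of this classical reservoir sampling procedure follows (the function name is illustrative):

```python
import random

def reservoir_sample(stream, s):
    """After i elements, rho is a uniform random sample of size
    min(i, s) chosen without replacement from the stream prefix."""
    rho = []
    for i, element in enumerate(stream, start=1):
        if i <= s:
            rho.append(element)
        elif random.random() < s / i:
            # include element i, evicting a uniformly random victim
            rho[random.randrange(s)] = element
    return rho
```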

For the first part of Lemma 9, let X_i be an indicator random variable for the event that update i causes ρ to change, and let X=Σ_{i=1}^{n} X_i. Hence, E[X_i]=1 for 1≤i≤s, and E[X_i]=s/i for all i>s. Then E[X]=s+Σ_{i=s+1}^{n} s/i=s+s(H_n−H_s), where H_i=ln i+O(1) is the i-th Harmonic number. All of the X_i are independent indicator random variables. It follows by a Chernoff bound that

${\Pr \lbrack X\rbrack} < {E\left\lbrack {X/2} \right\rbrack} \leq {\exp \left( {{- {E\lbrack X\rbrack}}/8} \right)} \leq {\exp\left( {{- {\left( {s + {s\; {\ln \left( {n/s} \right)}} - {{O(1)}/8}} \right).}}\mspace{79mu} \leq {{\exp \left( {- {\Theta (s)}} \right)}\left( \frac{s}{n} \right)^{s/8}}} \right.}$

There is an absolute constant n₀ so that for any n≥n₀, this probability is less than any given constant q, and so the first part of Lemma 9 follows.

For the second part of Lemma 9, consider the i-th epoch, i>0, which contains β^{i−1}k consecutive updates. Let Y_i be the number of changes in this epoch. Then E[Y_i] ≥ s(H_{β^{i−1}k} − H_{β^{i−2}k}) = Ω(s log β). Note that β≥9, since s<k/8 by assumption and β=1+k/s. Since Y_i can be written as a sum of independent indicator random variables, by a Chernoff bound,

${\Pr \left\lbrack {Y_{i} < {{E\left\lbrack Y_{i} \right\rbrack}/2}} \right\rbrack} \leq {\exp \left( {{- {E\left\lbrack Y_{i} \right\rbrack}}/8} \right)} \leq {\exp \left( {- {\Omega \left( {s\; \log \; \beta} \right)}} \right)} \leq {\frac{1}{\beta \; {\Omega (s)}}.}$

Hence, the expected number of epochs i for which Y_i < E[Y_i]/2 is at most

$\sum_{i=1}^{e} \frac{1}{\beta^{\Omega(s)}},$

which is o(e) since β≥9 and e=ω(1). By a Markov bound, with probability at least 1−q/2, at most o(e/q)=o(e) epochs i satisfy Y_i < E[Y_i]/2. It follows that with probability at least 1−q/2, there are at least (1−q/8)e epochs i for which the number Y_i of changes in epoch i is at least E[Y_i]/2 ≥ C′s log β = C′s log(1+(k/s)) for a constant C′>0, as desired.

Corner Cases

When s≥k/8, the statement of Theorem 2 gives a lower bound of Ω(s log(n/s)). In this case Theorem 2 follows immediately from the first part of Lemma 9, since these changes in ρ must be communicated to the central coordinator. Hence, in what follows it can be assumed that s<k/8. Notice also that if k=O(1), then

${\frac{k\; {\log \left( {n/s} \right)}}{\; {\log \left( {1 + \left( {k/s} \right)} \right)}} = {O\left( {s\; {\log \left( {n/s} \right)}} \right)}},$

and so the theorem is independent of k, and it follows simply by the first part of Lemma 9. Notice also that if e=O(1), then the statement of Theorem 2 amounts to providing an Ω(k) lower bound, which follows trivially since every site must send at least one message. Thus, in what follows, the second part of Lemma 9 may be applied.

Main Case: Let C>0 be a sufficiently small constant, depending on q, to be determined below. Let Π be a possibly randomized protocol which, with probability at least q, sends at most Cke messages. It is shown that Π cannot be a correct protocol.

Let τ denote the random coin tosses of Π, that is, the concatenation of the random strings of all k sites together with that of the central coordinator.

Let $\mathcal{E}$ be the event that Π sends fewer than Cke messages. By assumption, Pr_τ[$\mathcal{E}$]≥q. Hence, it is also the case that

$\Pr_{\tau, \{\rho_{i}\}, \{\sigma_{i}\}}[\mathcal{E}] \geq q.$

For a sufficiently small constant C′>0 that may depend on q, let $\mathcal{F}$ be the event that there are at least (1−(q/8))e epochs for which the number of times ρ changes in the epoch is at least C′s log(1+(k/s)). By the second part of Lemma 9,

$\Pr_{\tau, \{\rho_{i}\}, \{\sigma_{i}\}}[\mathcal{F}] \geq 1 - q/2.$

It follows that there is a fixing of τ=τ′, as well as a fixing of (ρ_0, ρ_1, . . . , ρ_e) to (ρ′_0, ρ′_1, . . . , ρ′_e), for which $\mathcal{F}$ occurs and

$\Pr_{\{\sigma_{i}\}}\left[ \mathcal{E} \mid \tau = \tau', (\rho_{0}, \rho_{1}, \ldots, \rho_{e}) = (\rho'_{0}, \rho'_{1}, \ldots, \rho'_{e}) \right] \geq q - q/2 = q/2.$

Notice that the three (sets of) random variables τ, {ρ_i}, and {σ_i} are independent, and so in particular, {σ_i} is still uniformly random given this conditioning.

By a Markov argument, if event $\mathcal{E}$ occurs, then there are at least (1−(q/8))e epochs for which at most (8/q)·C·k messages are sent. If events $\mathcal{E}$ and $\mathcal{F}$ both occur, then by a union bound, there are at least (1−(q/4))e epochs for which at most (8/q)·C·k messages are sent and ρ changes in the epoch at least C′s log(1+(k/s)) times. Such an epoch is referred to herein as balanced.

Let i* be the epoch which is most likely to be balanced, over the random choices of {σ_i}, conditioned on τ=τ′ and (ρ_0, ρ_1, . . . , ρ_e)=(ρ′_0, ρ′_1, . . . , ρ′_e). Since at least (1−(q/4))e epochs are balanced if $\mathcal{E}$ and $\mathcal{F}$ occur, conditioned on (ρ_0, . . . , ρ_e)=(ρ′_0, . . . , ρ′_e) event $\mathcal{F}$ does occur, and $\mathcal{E}$ occurs with probability at least q/2 given this conditioning, it follows that

$\Pr_{\{\sigma_{i}\}}\left[ i^{*} \text{ is balanced} \mid \tau = \tau', (\rho_{0}, \rho_{1}, \ldots, \rho_{e}) = (\rho'_{0}, \rho'_{1}, \ldots, \rho'_{e}) \right] \geq q/2 - q/4 = q/4.$

The property of i* being balanced is independent of σ_j for j≠i*, so

$\Pr_{\sigma_{i^{*}}}\left[ i^{*} \text{ is balanced} \mid \tau = \tau', (\rho_{0}, \rho_{1}, \ldots, \rho_{e}) = (\rho'_{0}, \rho'_{1}, \ldots, \rho'_{e}) \right] \geq q/4.$

Since C′s log(1+(k/s))>0 and ρ changes at least C′s log(1+(k/s)) times in epoch i*, ρ changes at least once in epoch i*. Suppose the first update in the global stream at which ρ changes is the j*-th update. In order for i* to be balanced for at least a q/4 fraction of the σ_{i*}, there must be at least qk/4 different servers which, upon receiving j*, cause Π to send a message. In particular, since Π is deterministic conditioned on τ, at least qk/4 messages must be sent in the i*-th epoch. But i* was chosen so that at most (8/q)·C·k messages are sent, which is a contradiction for C<q²/32.

It follows that a contradiction has been reached, and hence at least Cke messages must be sent with probability at least 1−q. Since

${{Cke} = {\Omega \left( \frac{k\; {\log \left( {n/s} \right)}}{\; {\log \left( {1 + \left( {k/s} \right)} \right)}} \right)}},$

this completes the proof.

Sampling with Replacement

Now described is an example process to maintain a random sample of size s with replacement from S. The basic idea is to run in parallel s copies of the single-item sampling process described above. Done naively, this will lead to a message complexity of

$O\left( sk \frac{\log n}{\log k} \right).$

Embodiments improve on this using the following.

View the distributed streams as s logical streams, S^i, i=1 . . . s. Each S^i is identical to S, but the process assigns independent weights to the different copies of the same element in the different logical streams. Let ω^i(e) denote the weight assigned to element e in S^i; ω^i(e) is a random number between 0 and 1. For each i=1 . . . s, the coordinator maintains the minimum weight, ω^i, among all elements in S^i, and the corresponding element.

Let β=max_{i=1 . . . s} ω^i; β is maintained by the coordinator. Each site j maintains β_j, a local view of β, which is always greater than or equal to β. Whenever a logical stream element at site j has a weight less than β_j, the site sends it to the coordinator, receives in response the current value of β, and updates β_j. When a random sample is requested at the coordinator, it returns the set of all minimum weight elements in all s logical streams. It can easily be seen that this process is correct, and at all times returns a random sample of size s selected with replacement. The main optimization relative to the naïve approach described above is that when a site sends a message to the coordinator, it receives β, which provides partial information about all the ω^i values. This provides a substantial improvement in the message complexity and leads to the following bounds.
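A hedged Python sketch of this with-replacement variant follows. For brevity the sketch transmits all s logical weights of a reported element in a single message, which is a simplifying assumption of the sketch rather than part of the described process; class names are illustrative.

```python
import random

class ReplacementCoordinator:
    """Tracks, for each of s logical streams, the minimum weight and
    its element; beta is the largest of these s minima."""

    def __init__(self, s):
        self.min_w = [1.0] * s
        self.sample = [None] * s

    def report(self, element, weights):
        for i, w in enumerate(weights):
            if w < self.min_w[i]:
                self.min_w[i] = w
                self.sample[i] = element
        return max(self.min_w)  # current beta

class ReplacementSite:
    def __init__(self, s, coordinator):
        self.s = s
        self.coordinator = coordinator
        self.beta_local = 1.0   # local view of beta; never below beta

    def observe(self, element):
        # independent weights for the s logical copies of the element
        weights = [random.random() for _ in range(self.s)]
        if min(weights) < self.beta_local:
            self.beta_local = self.coordinator.report(element, weights)
```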

Theorem 3. The above process continuously maintains a sample of size s with replacement from S, and its expected message complexity is O(s log s log n) in case k ≤ 2s log s, and

$O\left( k \frac{\log n}{\log\left( \frac{k}{s \log s} \right)} \right)$

in case k > 2s log s.

Proof. A sketch of the proof is provided. The analysis of the message complexity is similar to the case of sampling without replacement; it is sketched here and the details are omitted for the sake of compact description. The execution is divided into epochs, where in epoch i the value of β at the coordinator decreases by at least a factor of r (a parameter to be determined). Let ξ denote the number of epochs. It can be seen that

$E[\xi] = O\left( \frac{\log n}{\log r} \right).$

In epoch i, let X_i denote the number of messages sent from the sites to the coordinator in the epoch, m_i denote the value of β at the beginning of the epoch, and n_i denote the number of elements of S that arrived in the epoch.

The n_i elements in epoch i give rise to sn_i logical elements, and each logical element has a probability of no more than m_i of resulting in a message to the coordinator. Similar to the proof of Lemma 6, it can be shown using conditional expectations that E[X_i] ≤ rs log s. Thus, the expected total number of messages in epoch i is bounded by (k + 2rs log s), and in the entire execution is

${O\left( {\left( {k + {2{rs}\; \log \; s}} \right)\frac{\; {\log \; n}}{\log \; r}} \right)}.$

By choosing r=2 for the case k ≤ 2s log s, and r=k/(s log s) for the case k > 2s log s, the desired result is obtained.

Thus, in view of the foregoing description, and now referring again to FIG. 4, it can be appreciated that in an environment where a plurality of sensors 410, 420, such as network routers, handle streaming data elements, such as IP addresses, an embodiment provides for sampling from the distributed streams by maintaining a random sample at the coordinator 430 with a minimum use of bandwidth and message complexity. At the outset, all sensors 410, 420 assign a random number within a predetermined range (for example, 0-1) to a data element received and report it as an initial sample to the coordinator 430. The coordinator 430 in turn determines a global value (for example, a minimum or maximum value from the plurality of random numbers received thus far) from among the random numbers. The coordinator responds to those sensors, for example sensor 410, that have not reported the data element with the lowest random value. The element holding the coordinator's current lowest value (globally) is a random sample from the plurality of streams, as reported by sensors 410, 420.

Referring to FIG. 5, once a sensor (for example, sensor 410) has received a communication from the coordinator indicating the global value (for example, a global minimum value), the sensor continues to receive data elements from the data stream 501 and continues to assign random numbers to the received data elements 502. By comparing 503 the random number assigned for a given data element to a current global threshold value, the sensor may determine 504 when it is appropriate to report a sample back to a coordinator node. The sensor may determine that the current random number value is not reportable 505, and may maintain its current sample 506.

However, when the random number value is reportable, for example being lower than the current global threshold value (the last value received from the coordinator), the sensor may determine that it is appropriate to report the sample back to the coordinator 507. In such a case, the sensor may store the new (locally) minimum value 508 and report it back to the coordinator 509.

The coordinator node in turn continues to compare received sample values from the sensors 510, and determines if these reported values establish a new global value (a global minimum in this example) 511. If so 512, the coordinator establishes a new global value 513. However, if the reported sample is determined not to establish a new global value (is not below the current global minimum value in this example) 514, the coordinator communicates the current global minimum to the reporting sensor so that the reporting sensor may update its current global threshold value 515.
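Tying the pieces together, the following end-to-end simulation (illustrative assumptions: the Site and Coordinator classes sketched earlier, round-robin arrival of elements across sites) shows the s=1 protocol of FIGS. 1-3 producing a sample:

```python
import random

def simulate(k, n, seed=0):
    """Feed n elements round-robin to k sites and return the
    coordinator's sample, one uniform draw from the union."""
    random.seed(seed)
    coordinator = Coordinator()  # from the sketch given earlier
    sites = [Site(coordinator) for _ in range(k)]
    for i in range(n):
        sites[i % k].observe(("element", i))
    return coordinator.sample

print(simulate(k=8, n=10000))
```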

Referring to FIG. 6, it will be readily understood that embodiments may be implemented using any of a wide variety of devices or combinations of devices, for example for implementing sensor and/or coordinator functionality as described herein. An example device that may be used in implementing embodiments includes a computing device in the form of a computer 610. In this regard, the computer 610 may execute program instructions configured to provide for random sampling from distributed streams, and perform other functionality of the embodiments, as described herein.

Components of computer 610 may include, but are not limited to, at least one processing unit 620, a system memory 630, and a system bus 622 that couples various system components including the system memory 630 to the processing unit(s) 620. The computer 610 may include or have access to a variety of computer readable media. The system memory 630 may include computer readable storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and/or random access memory (RAM). By way of example, and not limitation, system memory 630 may also include an operating system, application programs, other program modules, and program data.

A user may interface with (for example, enter commands and information into) the computer 610 through input devices 640. A monitor or other type of device can also be connected to the system bus 622 via an interface, such as an output interface 650. In addition to a monitor, computers may also include other peripheral output devices. The computer 610 may operate in a networked or distributed environment using logical connections (network interface 660) to other remote computers or databases (remote device(s) 670), such as for communication between sensors and a coordinator. The logical connections may include a network, such as a local area network (LAN) or a wide area network (WAN), but may also include other networks/buses.

As will be appreciated by one skilled in the art, aspects may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, or an embodiment including software (including firmware, resident software, micro-code, et cetera) that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in at least one computer readable medium(s) having computer readable program code embodied thereon.

Any combination of at least one computer readable medium(s) may be utilized. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having at least one wire(s), a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible or non-signal medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, et cetera, or any suitable combination of the foregoing.

Computer program code for carrying out operations for embodiments may be written in any combination of programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on a first computer, partly on the first computer, as a stand-alone software package, partly on the first computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the first computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Embodiments are described with reference to figures of methods, apparatus (systems) and computer program products according to embodiments. It will be understood that portions of the figures can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified. The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified.

This disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limiting. Many modifications and variations will be apparent to those of ordinary skill in the art. The example embodiments were chosen and described in order to explain principles and practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Although illustrated example embodiments have been described herein with reference to the accompanying drawings, it is to be understood that embodiments are not limited to those precise example embodiments, and that various other changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the disclosure.

What is claimed is:
1. A method for distributed sampling on a network with a plurality of sites and a coordinator, comprising: receiving at the coordinator a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; comparing the weight of the data element received with a global value stored at the coordinator; and performing one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.

2. The method of claim 1, wherein the weight randomly associated with the data element at the site is deemed reportable by the site if the weight randomly associated with the data element is less than the locally stored global value at the site.

3. The method of claim 2, wherein the communicating the global value stored at the coordinator back to the site of the plurality of sites is performed responsive to a determination that the weight randomly associated with the data element received is greater than the global value stored at the coordinator.

4. The method of claim 3, wherein the updating the global value stored at the coordinator to the weight of the data element received is performed responsive to a determination that the weight randomly associated with the data element received is less than the global value stored at the coordinator.

5. The method of claim 1, wherein the weight randomly associated with the data element at the site is deemed reportable by the site if the weight randomly associated with the data element is greater than the locally stored global value.

6. The method of claim 5, wherein the communicating the global value stored at the coordinator back to the site of the plurality of sites is performed responsive to a determination that the weight randomly associated with the data element received is less than the global value stored at the coordinator.

7. The method of claim 6, wherein the updating the global value stored at the coordinator to the weight of the data element received is performed responsive to a determination that the weight randomly associated with the data element received is greater than the global value stored at the coordinator.

8. The method of claim 1, further comprising: at each site: receiving a plurality of data elements from a data stream; associating a random weight with each of the plurality of elements; and responsive to a data element being associated with a random weight below the locally stored global value at a receiving site, sending the data element to the coordinator.

9. The method of claim 1, further comprising storing, at the coordinator, at least one received data element from the plurality of sites.

10. The method of claim 1, further comprising estimating a number of distinct elements using the global value stored at the coordinator.

11. The method of claim 1, further comprising: receiving a query at the coordinator; and answering said query using a random sample based on at least one data element stored at the coordinator.

12. The method of claim 1, wherein said plurality of sites are network routers, and further wherein said network routers are connected to said coordinator via at least one network connection.
13. A computer program product for distributed sampling on a network with a plurality of sites and a coordinator, comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to receive, at the coordinator, a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; computer readable program code configured to compare the weight of the data element received with a global value stored at the coordinator; and computer readable program code configured to perform one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.

14. The computer program product of claim 13, wherein the computer readable program code configured to communicate the global value stored at the coordinator back to the site of the plurality of sites is further configured to communicate the global value responsive to a determination that the weight randomly associated with the data element received is greater than the global value stored at the coordinator.

15. The computer program product of claim 14, wherein the computer readable program code configured to update the global value stored at the coordinator to the weight of the data element received is further configured to update the global value stored at the coordinator responsive to a determination that the weight randomly associated with the data element received is less than the global value stored at the coordinator.

16. The computer program product of claim 13, wherein the computer readable program code configured to communicate the global value stored at the coordinator back to the site of the plurality of sites is further configured to communicate the global value responsive to a determination that the weight randomly associated with the data element received is less than the global value stored at the coordinator.

17. The computer program product of claim 16, wherein the computer readable program code configured to update the global value stored at the coordinator to the weight of the data element received is further configured to update the global value at the coordinator responsive to a determination that the weight randomly associated with the data element received is greater than the global value stored at the coordinator.

18. The computer program product of claim 13, further comprising computer readable program code configured to: store, at the coordinator, at least one received data element from the plurality of sites; receive a query at the coordinator; and answer said query using a random sample based on at least one data element stored at the coordinator.

19. The computer program product of claim 13, wherein said plurality of sites are network routers, and further wherein said network routers are connected to said coordinator via at least one network connection.

20. A system comprising: at least one processor; and a memory device operatively connected to the at least one processor; wherein, responsive to execution of program instructions accessible to the at least one processor, the at least one processor is configured to: receive at the coordinator a data element from a site of the plurality of sites, said data element having a weight randomly associated therewith deemed reportable by comparison at the site to a locally stored global value; compare the weight of the data element received with a global value stored at the coordinator; and perform one of: updating the global value stored at the coordinator to the weight of the data element received; and communicating the global value stored at the coordinator back to the site of the plurality of sites.